44 research outputs found

    DREAM3: Network Inference Using Dynamic Context Likelihood of Relatedness and the Inferelator

    Many current works aiming to learn regulatory networks from systems biology data must balance model complexity with respect to data availability and quality. Methods that learn regulatory associations based on unit-less metrics, such as Mutual Information, are attractive in that they scale well and reduce the number of free parameters (model complexity) per interaction to a minimum. In contrast, methods for learning regulatory networks based on explicit dynamical models are more complex and scale less gracefully, but are attractive as they may allow direct prediction of transcriptional dynamics and resolve the directionality of many regulatory interactions. We aim to investigate whether scalable information-based methods (like the Context Likelihood of Relatedness method) and more explicit dynamical models (like Inferelator 1.0) prove synergistic when combined. We test a pipeline where a novel modification of the Context Likelihood of Relatedness (mixed-CLR, modified to use time series data) is first used to define likely regulatory interactions and then Inferelator 1.0 is used for final model selection and to build an explicit dynamical model. Our method ranked 2nd out of 22 in the DREAM3 100-gene in silico networks challenge. Mixed-CLR and Inferelator 1.0 are complementary, demonstrating a large performance gain relative to any single tested method, with precision being especially high at low recall values. Partitioning the provided data set into four groups (knock-down, knock-out, time-series, and combined) revealed that using comprehensive knock-out data alone provides optimal performance. Inferelator 1.0 proved particularly powerful at resolving the directionality of regulatory interactions, i.e. "who regulates who" (approximately of identified true positives were correctly resolved). Performance drops for high in-degree genes, i.e. as the number of regulators per target gene increases, but not with out-degree, i.e. performance is not affected by the presence of regulatory hubs.
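As a rough illustration of the scoring idea behind the Context Likelihood of Relatedness, the sketch below implements the commonly described standard CLR score: each mutual-information value is z-scored against the background of each of the two genes involved, and the two z-scores are combined. This is not the mixed-CLR variant from the abstract (its time-series modification is not described here), and the function name `clr_scores` and the toy matrix are assumptions made purely for illustration.

```python
import numpy as np

def clr_scores(mi):
    """Standard CLR sketch: z-score each MI value against the background
    distribution of both genes, then combine the two z-scores.

    `mi` is assumed to be a symmetric gene-by-gene mutual-information matrix.
    For simplicity the diagonal is included in each gene's background.
    """
    mi = np.asarray(mi, dtype=float)
    mean = mi.mean(axis=1, keepdims=True)
    std = mi.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for constant rows
    # z[i, j]: how unusual MI(i, j) is relative to all MI values of gene i
    z = np.maximum(0.0, (mi - mean) / std)
    # combine the z-score seen from gene i with the one seen from gene j
    return np.sqrt(z ** 2 + z.T ** 2)

# Toy mutual-information matrix for three hypothetical genes
toy_mi = np.array([
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.2],
    [0.1, 0.2, 0.0],
])
toy_clr = clr_scores(toy_mi)
```

Higher scores mark gene pairs whose mutual information stands out against both genes' backgrounds; in a pipeline like the one described, such scores would only pre-select candidate interactions before a dynamical model is fitted.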

    DREAM4: Combining Genetic and Dynamic Information to Identify Biological Networks and Dynamical Models

    Current technologies have led to the availability of multiple genomic data types in sufficient quantity and quality to serve as a basis for automatic global network inference. Accordingly, there is currently a large variety of network inference methods that learn regulatory networks to varying degrees of detail. These methods have different strengths and weaknesses and thus can be complementary. However, combining different methods in a mutually reinforcing manner remains a challenge. We investigate how three scalable methods can be combined into a useful network inference pipeline. The first is a novel t-test-based method that relies on a comprehensive steady-state knock-out dataset to rank regulatory interactions. The remaining two are previously published mutual information and ordinary differential equation based methods (tlCLR and Inferelator 1.0, respectively) that use both time-series and steady-state data to rank regulatory interactions; the latter has the added advantage of also inferring dynamic models of gene regulation which can be used to predict the system's response to new perturbations. Our t-test-based method proved powerful at ranking regulatory interactions, tying for first out of methods in the DREAM4 100-gene in-silico network inference challenge. We demonstrate complementarity between this method and the two methods that take advantage of time-series data by combining the three into a pipeline whose ability to rank regulatory interactions is markedly improved compared to either method alone. Moreover, the pipeline is able to accurately predict the response of the system to new conditions (in this case new double knock-out genetic perturbations). Our evaluation of the performance of multiple methods for network inference suggests avenues for future methods development and provides simple considerations for genomic experimental design. Our code is publicly available at http://err.bio.nyu.edu/inferelator/

    Search for dark matter produced in association with bottom or top quarks in √s = 13 TeV pp collisions with the ATLAS detector

    A search for weakly interacting massive particle dark matter produced in association with bottom or top quarks is presented. Final states containing third-generation quarks and missing transverse momentum are considered. The analysis uses 36.1 fb⁻¹ of proton–proton collision data recorded by the ATLAS experiment at √s = 13 TeV in 2015 and 2016. No significant excess of events above the estimated backgrounds is observed. The results are interpreted in the framework of simplified models of spin-0 dark-matter mediators. For colour-neutral spin-0 mediators produced in association with top quarks and decaying into a pair of dark-matter particles, mediator masses below 50 GeV are excluded assuming a dark-matter candidate mass of 1 GeV and unitary couplings. For scalar and pseudoscalar mediators produced in association with bottom quarks, the search sets limits on the production cross-section of 300 times the predicted rate for mediators with masses between 10 and 50 GeV and assuming a dark-matter mass of 1 GeV and unitary coupling. Constraints on colour-charged scalar simplified models are also presented. Assuming a dark-matter particle mass of 35 GeV, mediator particles with mass below 1.1 TeV are excluded for couplings yielding a dark-matter relic density consistent with measurements.

    Measurement of jet fragmentation in Pb+Pb and pp collisions at √s_NN = 2.76 TeV with the ATLAS detector at the LHC


    Error as a function of binned median expression for all regulatory interactions.

    We further investigate the relationship between the median expression of the regulators and each pipeline's performance in predicting topology. We use relative rank as an estimate of error (as in Figure 3). We bin the regulators for all five networks based on their median expression (each of the seven bins has a roughly equal number of regulators). We show the distribution of relative ranks (Error) for each pipeline in each bin of regulator expression. We see that all of the pipelines that incorporate the predictions of tlCLR-Inferelator (pipelines 2, 3, and 4) outperform MCZ for regulators with low median expression (bins , ).

    Trends in performance over the five networks.

    For panels A, B, and C we consider only the performance of MCZ, and use relative rank as an estimate of error. We compute relative rank in the following way. Denote by the total number of possible regulatory interactions, and by the rank that was given to each regulatory interaction, . The relative rank of is defined to be . Error distributions of the predictions for the five networks are shown as black boxplots in panels A, B, and C. Distributions of in-degree of the regulators, out-degree of the regulators, and median expression of the regulators are shown as gray boxplots in panels A, B, and C, respectively. A) There is no apparent relationship between relative rank (Error) and the in-degree of the regulators. B) There is no apparent relationship between relative rank (Error) and the out-degree of the regulators. C) Relative rank (Error) in network prediction increases as the median expression of the regulators decreases. D) We show the relationship between median expression of the regulators and the performance in ranking regulatory interactions, in terms of AUPR, across all five networks. For MCZ a correlation of () exists between the TFs' median expression and AUPR (shown in red), while for Resampling+MCZ there is a smaller correlation of () between the TFs' median expression and AUPR (shown in purple).

    Performance on double knock-out prediction.

    We assess the accuracy of predicting the system's response to the simultaneous removal (knock-out) of two genes . In total, there were one hundred pairs of genes that were knocked out. We bin these pairs of genes based on the average of their respective median expression in the single-gene knock-out data. We made two predictions, which differ only in the choice of initial conditions. We compare the error (as evaluated by the mean squared error) of our prediction to the error made by using the respective initial condition as a prediction. A) We use the wild-type expression, , as the set of initial conditions (green boxplots). We see that our predictions (black and red boxplots) are more accurate than if we used the initial conditions as a prediction (this is more apparent for TFs with a larger median expression). B) We use a combination of the single-gene knock-outs to compute our initial conditions (eq. 25). We do this because the single-gene knock-out data represents a system state that is closer to the state we are trying to predict than wild-type (as can be observed by comparing the green boxplots in panel A to those in panel B). The error distributions obtained using parameters calculated by either pipeline 3 (tlCLR-Inferelator+MCZ) or pipeline 4 (Resampling+MCZ), shown as gray and red boxplots, respectively, are smaller than the error distributions obtained by using the initial conditions as a prediction. Regardless of the choice of initial conditions, the error distributions using parameters calculated by pipeline 4 (red boxplots) are similar to the error distribution obtained by pipeline 3.
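The comparison described in this caption (model prediction versus simply reusing the initial conditions as the prediction, both scored by mean squared error) can be sketched as follows. The function names and the toy expression vectors are hypothetical; the actual dynamical model and the initial-condition formula (eq. 25) are not reproduced here.

```python
import numpy as np

def mse(predicted, observed):
    """Mean squared error between two expression vectors."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean((predicted - observed) ** 2))

def beats_baseline(model_prediction, initial_conditions, observed):
    """True if the model's prediction has lower MSE than the baseline that
    reuses the initial conditions unchanged as the prediction."""
    return mse(model_prediction, observed) < mse(initial_conditions, observed)

# Toy expression values for three hypothetical genes after a double knock-out
observed = [0.0, 0.0, 0.6]
wild_type = [0.8, 0.7, 0.5]        # baseline: initial conditions as the prediction
model_prediction = [0.1, 0.0, 0.55]

model_is_better = beats_baseline(model_prediction, wild_type, observed)
```

Swapping `wild_type` for initial conditions built from single-gene knock-out data would correspond to the second comparison in the caption, where the baseline itself starts closer to the target state.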

    Area under precision recall curve for each ranking scheme.

    For each pipeline we evaluated the performance in predicting topology using area under the precision recall curve (AUPR). We see that pipeline 4 generally outperforms all other methods, followed by MCZ, pipeline 3, and pipeline 2.
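For readers unfamiliar with AUPR, a minimal sketch of one common estimator is given below: the ranked interaction list is walked from the top, and precision is averaged over the positions where a true interaction is recovered (a step-wise approximation of the area, often called average precision). The DREAM challenges' own scoring scripts may integrate the curve differently; the function name `aupr` and the toy inputs are assumptions for illustration only.

```python
import numpy as np

def aupr(scores, labels):
    """Step-wise area under the precision-recall curve (average precision).

    scores: confidence assigned to each candidate regulatory interaction
    labels: 1 if the interaction is in the gold-standard network, else 0
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = int(labels.sum())
    if n_pos == 0:
        raise ValueError("at least one true interaction is required")
    order = np.argsort(-scores, kind="stable")  # highest confidence first
    hits = labels[order]
    true_positives = np.cumsum(hits)
    precision = true_positives / np.arange(1, len(hits) + 1)
    # average the precision at each rank where a true interaction is found
    return float((precision * hits).sum() / n_pos)

perfect = aupr([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
imperfect = aupr([0.9, 0.8, 0.7], [0, 1, 1])
```

A ranking that places every true interaction ahead of every false one scores 1.0, and the score drops as false interactions move up the list, which is why AUPR rewards high precision at low recall.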